Abstract

Although perfectly scalable items rarely occur in practice,
Guttman's concept of a scale has proved to be valuable to the
development of measurement theory. If the score distribution is
uniform and there is an equal number of items at each difficulty level,
both the elements and the eigenvalues of the Pearson correlation matrix
of dichotomous Guttman-scalable items can be expressed as simple
functions of the number of items. Even when these special conditions
do not hold, the values of the correlations can be computed easily by
assuming a particular score distribution. These findings are useful in
conducting research on the properties of scales.

Guttman (1941, 1950a) developed the concept of an idealized
type of attitude scale with the following property: "Persons who
answer a given question favorably all have higher ranks on the scale
than persons who answer the same question unfavorably. From a
respondent's rank or scale score we know exactly which items he
endorsed" (Suchman, 1950, p. 9). Although sets of items that
follow this pattern rarely occur in practice, the concept of
Guttman-scalable items has proved to be useful in the development of
measurement theory. The properties described in this paper apply
only to dichotomous Guttman items, though Guttman's theory comprises
items with multiple score categories. To simplify the discussion, it
will be assumed that these are cognitive items that are either correct
or incorrect. For the case of cognitive rather than attitude items, the
analogue of the scalability property described above is that items can
be ordered according to difficulty such that individuals who answer a
given item correctly also answer all previous items correctly.

One well-known property of dichotomous Guttman items
is that, for n items, no two of which have the same marginals, the
Pearson (phi) correlation matrix is of rank n (e.g., Torgerson, 1958,
p. 312), despite the fact that the items can be ordered along a
single dimension. It can be demonstrated that under certain uniformity
conditions, both the elements and the eigenvalues of the Pearson
correlation matrix, $\Phi$, can be expressed as simple functions of the
number of items. These results are closely related to Guttman's
(1941, 1950c) findings on the principal components of scale
analysis. Using a method that is now known as multiple
correspondence analysis, Guttman obtained the latent structure of a
transformation of the matrix of item responses. He did not,
however, express the eigenvalues as simple functions of the number
of items, nor did he point out the relation between the latent
structure he derived and that of the phi matrix.

An understanding of the structure of the Pearson correlation
matrix of Guttman items is of value in conducting research on the
properties of scales. For example, in investigating methods of
dimensionality assessment for dichotomous data, it is useful to
determine the results of applying potential methods to Guttman items.
It is advantageous to be able to generate the desired correlation
matrices without generating the item responses themselves. A general
form for the eigenvalues of the Pearson correlation matrix is also
useful; these eigenvalues can be regarded as a standard to which the
roots of other proximity matrices can be compared.

Notational Scheme

Table 1 gives a schematic representation of admissible response
patterns, called a scalogram, for a set of n "types" of
Guttman-scalable dichotomous items. In Guttman's terminology, items with the
same marginal distribution (proportion correct) are said to be of the
same type. The n + 1 rows of Table 1 correspond to the n + 1
permissible response patterns. The first n columns correspond
to incorrect responses to the n items; the next n columns correspond
to correct responses. In the body of the table, ones indicate cells
in which observations occur, according to the definition of a Guttman
scale; zeroes indicate cells in which it is impossible for
observations to occur. The row and column totals shown on the inner
margins of the table are the totals of the indicator variables. The
outer margins of Table 1 give the number of subjects for each row and
column of the table. The notation $f_i$ represents the number of
subjects giving response pattern i.

Table 1
Indicator Matrix for
Admissible Response Patterns
for n Dichotomous Guttman Items

Response    Incorrect Responses    Correct Responses     Row       Frequency of
Pattern     Item: 1   2  ...  n    Item: 1   2  ...  n   Total     Respondents

1                 1   1  ...  1          0   0  ...  0   n         $f_1$
2                 0   1  ...  1          1   0  ...  0   n         $f_2$
3                 0   0  ...  1          1   1  ...  0   n         $f_3$
.                 .   .       .          .   .       .   .         .
.                 .   .       .          .   .       .   .         .
n + 1             0   0  ...  0          1   1  ...  1   n         $f_{n+1}$

Column Total      1   2  ...  n          n  n-1 ...  1   n(n+1)

Frequency of Responses: for the incorrect-response column of item k,
$\sum_{i=1}^{k} f_i$ (that is, $f_1$, $\sum_{i=1}^{2} f_i$, ..., $\sum_{i=1}^{n} f_i$);
for the correct-response column of item k, $\sum_{i=k+1}^{n+1} f_i$
(that is, $\sum_{i=2}^{n+1} f_i$, $\sum_{i=3}^{n+1} f_i$, ..., $f_{n+1}$);
grand total, $n \sum_{i=1}^{n+1} f_i$.
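
The layout of Table 1 can be made concrete with a short sketch. The Python functions below (the names `scalogram` and `column_totals` are illustrative choices, not part of the paper) build the indicator matrix for n items, with the n incorrect-response columns first and the n correct-response columns second, and recover the inner-margin column totals.

```python
def scalogram(n):
    """Return the (n + 1) x 2n indicator matrix of Table 1 as a list of rows.

    Row r (r = 1, ..., n + 1) is response pattern r. The first n entries flag
    incorrect responses to items 1..n; the last n entries flag correct ones.
    """
    rows = []
    for r in range(1, n + 2):
        # Under pattern r, items 1..r-1 are correct and items r..n are incorrect.
        incorrect = [1 if k >= r else 0 for k in range(1, n + 1)]
        correct = [1 - x for x in incorrect]
        rows.append(incorrect + correct)
    return rows


def column_totals(matrix):
    """Sum each column of a list-of-rows matrix."""
    return [sum(col) for col in zip(*matrix)]


if __name__ == "__main__":
    S = scalogram(3)
    for row in S:
        print(row)
    print("column totals:", column_totals(S))
```

For n = 3 the rows coincide with the body of Table 3, and every row sums to n because each subject gives exactly one response to each item.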

To simplify the presentation in this paper, the following
assumptions are made:

1. There is a uniform number of items per type. Letting $h_k$
denote the number of items of type k, k = 1, 2, ..., n, this assumption
can be expressed as $h_1 = h_2 = \dots = h_n = h$. The results
presented here concerning the elements and eigenvalues of $\Phi$ hold
regardless of the value of h, provided that it is constant for all
types. Therefore, it can be assumed without loss of generality that
h = 1; that is, there is only one item per type. Because two
Guttman-scalable items with the same proportion correct must have a
Pearson correlation of 1, another way of stating this assumption is
that no two items are perfectly correlated.

2. The frequencies are the same for each response pattern, i.e.,
$f_1 = f_2 = \dots = f_{n+1} = f$. (For Guttman items, there is a one-to-one
correspondence between response patterns and number-right scores.
Therefore, another way of stating this condition is that the frequencies
are the same for each number-right score.) This assumption and the
assumption that all $h_k = h$ are referred to jointly as the uniformity
conditions. Because the value of f does not affect the validity of the
results presented here, we can assume f = 1. Based on the uniformity
conditions and the further assumptions that h = 1, f = 1, the indicator
variables and inner margins of Table 1 can then be treated as
frequencies.

Table 2
Response Frequencies for Two Dichotomous
Guttman Items, i and j (i < j),
Under Uniformity Conditions

                          Item j
Item i            Correct (1)    Incorrect (0)    Total

Correct (1)       n + 1 - j      j - i            n + 1 - i
Incorrect (0)     0              i                i

Total             n + 1 - j      j                n + 1

The Pearson Correlation between Guttman Items
The phi correlation between two items, i and j, can be expressed as

$$\phi_{ij} = \frac{f_{11} f_{22} - f_{12} f_{21}}{\sqrt{f_{+1} f_{+2} f_{1+} f_{2+}}} \tag{1}$$

where $f_{rc}$ represents the frequency in the r,c cell of the 2 x 2 table
of responses to a pair of items, r = 1, 2; c = 1, 2; $f_{+c}$ is the
marginal frequency for column c and $f_{r+}$ is the marginal frequency for
row r. For a pair of Guttman items, the frequencies in the 2 x 2
table are as shown in Table 2 for two items, i and j, where i < j
(that is, item i is easier than item j). The number of subjects who
get both items right is equal to the total number of subjects, n + 1,
minus the index of the harder item, j. The number of subjects who get
item i correct and item j wrong is simply the difference between their
indices, j - i. Because i is the easier item, the number of subjects
who answer both items incorrectly is equal to i. Finally, there can
be no subjects who get item i wrong but get item j correct. Using
the computational formula in equation 1, the phi coefficient for items
i and j can be computed as

$$\phi_{ij} = \frac{(n + 1 - j)(i)}{\sqrt{(n + 1 - j)(j)(n + 1 - i)(i)}} = \sqrt{\frac{i\,(n + 1 - j)}{j\,(n + 1 - i)}}, \quad i < j. \tag{2}$$
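
As a numerical check on this step, the sketch below evaluates equation 1 on the Table 2 cell frequencies and compares the result with the closed form in equation 2. The function names are illustrative and not taken from the paper; the cell layout assumes rows index item i and columns index item j, as in Table 2.

```python
import math


def phi_from_counts(f11, f12, f21, f22):
    """Equation 1: phi coefficient from the four cells of a 2 x 2 table."""
    row1, row2 = f11 + f12, f21 + f22      # marginals f_{1+}, f_{2+}
    col1, col2 = f11 + f21, f12 + f22      # marginals f_{+1}, f_{+2}
    return (f11 * f22 - f12 * f21) / math.sqrt(col1 * col2 * row1 * row2)


def guttman_cells(n, i, j):
    """Table 2 cells for items i < j under the uniformity conditions (h = f = 1).

    Returned in the order: both correct, i correct and j incorrect,
    i incorrect and j correct, both incorrect.
    """
    return (n + 1 - j, j - i, 0, i)


def phi_closed_form(n, i, j):
    """Equation 2: closed-form phi for items i < j."""
    return math.sqrt(i * (n + 1 - j) / (j * (n + 1 - i)))


if __name__ == "__main__":
    n = 6
    worst = max(
        abs(phi_from_counts(*guttman_cells(n, i, j)) - phi_closed_form(n, i, j))
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
    )
    print("largest discrepancy over all pairs:", worst)
```

The two computations should agree to floating-point precision for every pair i < j.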



An alternative derivation of equation 2 can be obtained by observing
that, in a Guttman scale, the inter-item correlation is the maximum
that can be achieved, given the marginal distributions for the
items. The maximum phi coefficient that can be obtained from items
with proportions correct $p_i$ and $p_j$ is

$$\max(\phi_{ij}) = \sqrt{\frac{p_j\,(1 - p_i)}{p_i\,(1 - p_j)}}, \quad p_i > p_j \tag{3}$$

(Lord and Novick, 1968, p. 347). Guttman (1950b, p. 203) expresses
the correlation between scalable items in an equivalent form.
This equation applies even if the uniformity conditions do not hold.
Now, by noting that, under the uniformity conditions, the proportion
correct for the ith item in a Guttman scale can be expressed as
$p_i = (n + 1 - i)/(n + 1)$, equation 2 can be obtained from equation 3.
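
Spelled out, the substitution runs as follows (a worked step added for clarity; it uses only the expression for the proportion correct given above). Because the proportion incorrect for item i is i/(n + 1),

$$\frac{p_j\,(1 - p_i)}{p_i\,(1 - p_j)} = \frac{\dfrac{n + 1 - j}{n + 1} \cdot \dfrac{i}{n + 1}}{\dfrac{n + 1 - i}{n + 1} \cdot \dfrac{j}{n + 1}} = \frac{i\,(n + 1 - j)}{j\,(n + 1 - i)},$$

and taking the square root gives the expression in equation 2.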

It is useful to note that the expression for the correlation of
the first with the nth item can be simplified: Letting i = 1 and
j = n in equation 2,

$$\phi_{1,n} = \sqrt{\frac{1\,(n + 1 - n)}{n\,(n + 1 - 1)}} = \sqrt{\frac{1}{n^2}} = \frac{1}{n}. \tag{4}$$

Equation 2 can also be simplified slightly for the case of adjacent
items. Letting j = i + 1, equation 2 becomes

$$\phi_{i,i+1} = \sqrt{\frac{i\,(n - i)}{(i + 1)(n + 1 - i)}}. \tag{5}$$

The correlation of the first with the second item and the
second-last with the last item can both be simplified further. For
i = 1, j = 2 or i = n - 1, j = n, equation 5 becomes

$$\phi_{1,2} = \phi_{n-1,n} = \sqrt{\frac{n - 1}{2n}}. \tag{6}$$

Thus, for a given number of items, these two "border" correlations
are always equal. In fact, because the correlation matrix satisfies
the definition of a simplex (Guttman, 1954, p. 274), it is symmetric
with respect to its minor, as well as its major, diagonal.

Eigenvalues of the Pearson Correlation Matrix
A scalogram for three Guttman items is given in Table 3.
Because of the assumptions h = 1, f = 1, Table 3 is also a frequency
table. Under the uniformity conditions, the correlations for n = 3
Guttman items can be found from equations 4 and 6 to be
$\phi_{1,2} = \phi_{2,3} = \sqrt{1/3}$ and $\phi_{1,3} = 1/3$. This can be verified by computing the
correlations directly from Table 3. The eigenvalues of this matrix are
found to be 2, 2/3, and 1/3. To obtain an expression for the
eigenvalues in terms of n, we can express the correlations as in
equations 4 and 6 and then obtain a cubic equation for the eigenvalues,
$\lambda_i$, in terms of n. We find that the cubic equation can be factored as
follows:

$$[\lambda - (n + 1)/2]\,[\lambda - (n + 1)/6]\,[\lambda - (n + 1)/12] = 0.$$

The roots can be expressed more generally as

$$\lambda_i = (n + 1)/[i(i + 1)], \tag{7}$$

a result that holds for any n.¹ The smallest eigenvalue is thus
$\lambda_n = (n + 1)/[n(n + 1)] = 1/n$; the largest is $\lambda_1 = (n + 1)/2$. Note
that the proportion of variance attributable to the first principal
component, $\lambda_1/n = (n + 1)/2n$, approaches 1/2 as n approaches
infinity, a somewhat surprising result.
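
Equation 7 can be checked numerically for particular values of n. The sketch below builds the correlation matrix from equation 2 and extracts its eigenvalues with a textbook cyclic Jacobi rotation scheme. The Jacobi routine and all function names are assumptions made for this illustration (standard library only); they are not the method used in the paper, which factors the characteristic equation directly.

```python
import math


def uniform_phi_matrix(n):
    """Phi matrix for n Guttman items under the uniformity conditions (equation 2)."""
    phi = [[1.0] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            r = math.sqrt(i * (n + 1 - j) / (j * (n + 1 - i)))
            phi[i - 1][j - 1] = phi[j - 1][i - 1] = r
    return phi


def jacobi_eigenvalues(matrix, max_sweeps=100, tol=1e-22):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        # Stop once the sum of squared off-diagonal entries is negligible.
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(n) if p != q)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):          # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):          # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def predicted_eigenvalues(n):
    """Equation 7: lambda_i = (n + 1) / (i (i + 1)), i = 1, ..., n."""
    return [(n + 1) / (i * (i + 1)) for i in range(1, n + 1)]


if __name__ == "__main__":
    for n in (3, 5, 8):
        print(n, jacobi_eigenvalues(uniform_phi_matrix(n)))
        print(n, predicted_eigenvalues(n))
```

The predicted roots sum to n, as the eigenvalues of any n x n correlation matrix must, which gives a quick consistency check on equation 7.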

Table 3
Frequency Table for
Three Dichotomous Guttman Items

Response Patterns    Incorrect Responses    Correct Responses
(Subjects)           Item: 1   2   3        Item: 1   2   3      Row Total

1                          1   1   1              0   0   0      3
2                          0   1   1              1   0   0      3
3                          0   0   1              1   1   0      3
4                          0   0   0              1   1   1      3

Column Total               1   2   3              3   2   1      12

In discussing the principal components of scale analysis,
Guttman (1941, 1950c) does not analyze the correlation matrix.
Instead, he derives the latent structure of related matrices that
are transformations of the matrix of item responses. In the case of
n dichotomous items, his G matrix (1941, p. 331) is of dimensions
2n x 2n and can be expressed as

$$G = D^{-1/2}\left(S'S - \frac{1}{F}\,cc'\right) D^{-1/2} \tag{8}$$

where S denotes the (n + 1) x 2n matrix of item responses (e.g., Table
3), c is the 2n x 1 vector of column frequencies of S, F = n + 1 is the
number of subjects, and $D^{-1/2}$ is the diagonal matrix of reciprocal
square roots of these column frequencies. (For f > 1, the number of
rows of S would be expanded so that there were f rows for each of
the n + 1 response patterns. The sample size F = f(n + 1) would be used
in Equation 8. The dimensions of G would be unchanged. For h > 1,
the number of columns of G would be expanded so that there were 2h,
instead of 2, columns for each type of item. In this case, G would
be of dimensions 2hn x 2hn.) A general element of G can be expressed
as

$$g_{pq} = \frac{F_{pq} - F_p F_q / F}{\sqrt{F_p F_q}} \tag{9}$$

where $F_p$ and $F_q$ denote the number of individuals in columns p
and q, respectively, of S (p, q = 1, 2, ..., 2n), and $F_{pq}$ denotes the
number of individuals who are represented in both columns p and q of
S. Guttman states that "this element is recognized to be precisely
that used in the chi-square test of significance of association
between two attributes" (Guttman, 1941, p. 332). More
specifically, the Pearson chi-squared statistic for a pair of items
is equal to the sum of squares of the four appropriate elements
$g_{pq}$ of G (see Equation 11), multiplied by the sample size, F.
Noting that, for any two items represented in the scalogram,

$$\chi^2_{ij} = F \phi^2_{ij}, \tag{10}$$

we observe that the elements $g_{pq}$ might be described more precisely
as components of $\phi^2$. In fact, the relation between the elements
of the Pearson matrix and the elements of G can be expressed as

$$\phi_{ij} = \left(g_{ij}^2 + g_{i,j+n}^2 + g_{i+n,j}^2 + g_{i+n,j+n}^2\right)^{1/2} = \left[2\left(g_{ij}^2 + g_{i+n,j+n}^2\right)\right]^{1/2}. \tag{11}$$

The second equality applies when $g_{i,j+n}^2 = g_{ij}^2$ and
$g_{i+n,j}^2 = g_{i+n,j+n}^2$, as is the case for items 1 and 2 of Table 3.

For illustration, let us use Equation 11 to calculate $\phi_{1,2}$ from
G for the data of Table 3. The G and $\Phi$ matrices corresponding to
Table 3 are given in Table 4. $\phi_{1,2}$ can be obtained from the
elements of G as follows:

$$\phi_{1,2} = \left(g_{12}^2 + g_{15}^2 + g_{42}^2 + g_{45}^2\right)^{1/2} = \left[.35^2 + (-.35)^2 + (-.20)^2 + .20^2\right]^{1/2} = \left[2\,(.35^2 + .20^2)\right]^{1/2}.$$

Table 4
$\Phi$ and G Matrices for
n = 3 Guttman Items under
Uniformity Conditions

$\Phi$

1.00     .58     .33
 .58    1.00     .58
 .33     .58    1.00

G

 .75     .35     .14    -.43    -.35    -.25
 .35     .50     .20    -.20    -.50    -.35
 .14     .20     .25    -.08    -.20    -.43
-.43    -.20    -.08     .25     .20     .14
-.35    -.50    -.20     .20     .50     .35
-.25    -.35    -.43     .14     .35     .75
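
The construction of G in Equation 8 and the sum-of-squares relation in Equation 11 can be checked numerically for the n = 3 case. The sketch below is an illustration under the assumptions h = f = 1 (one row of S per subject); the function names are hypothetical, and the element-wise formula used inside `g_matrix` is Equation 8 written out entry by entry.

```python
import math


def scalogram(n):
    """Indicator matrix S: n incorrect-response columns, then n correct-response columns."""
    rows = []
    for r in range(1, n + 2):
        incorrect = [1 if k >= r else 0 for k in range(1, n + 1)]
        rows.append(incorrect + [1 - x for x in incorrect])
    return rows


def g_matrix(S):
    """Equation 8, G = D^{-1/2} (S'S - cc'/F) D^{-1/2}, with one subject per row of S."""
    F = len(S)                                          # number of subjects
    m = len(S[0])                                       # number of columns, 2n
    c = [sum(row[p] for row in S) for p in range(m)]    # column frequencies
    G = [[0.0] * m for _ in range(m)]
    for p in range(m):
        for q in range(m):
            f_pq = sum(row[p] * row[q] for row in S)    # joint frequency, (S'S)_pq
            G[p][q] = (f_pq - c[p] * c[q] / F) / math.sqrt(c[p] * c[q])
    return G


def phi_from_g(G, i, j):
    """Sum-of-squares form of Equation 11 for items i and j (1-based indices)."""
    n = len(G) // 2
    cells = [
        G[i - 1][j - 1],            # both incorrect
        G[i - 1][j - 1 + n],        # i incorrect, j correct
        G[i - 1 + n][j - 1],        # i correct, j incorrect
        G[i - 1 + n][j - 1 + n],    # both correct
    ]
    return math.sqrt(sum(g * g for g in cells))


if __name__ == "__main__":
    G = g_matrix(scalogram(3))
    for row in G:
        print([round(g, 2) for g in row])
    print("phi(1,2) =", phi_from_g(G, 1, 2))
    print("phi(1,3) =", phi_from_g(G, 1, 3))
```

Working from unrounded elements of G avoids the small rounding error that arises when the two-decimal entries of Table 4 are squared and summed by hand.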

Because G has two rows and columns corresponding to each item, one
for the correct response and one for the incorrect response, it may
be regarded as a redundant means of expressing the Pearson matrix.
G has n non-zero roots, which are identical to those of $\Phi$.

Guttman (1950c) finds the latent structure of two transformations, A
and B (pp. 338-339), of S that differ slightly from G. Because A
and B are the minor product moment and major product moment
matrices, respectively, of a rescaled version of S, their non-zero
roots are identical. There are n roots that are 1/n times the roots
of $\Phi$ or G, as well as an extraneous root of 1. The non-trivial
roots of A and B are interpretable as squared correlation ratios.
The largest non-trivial root of A is equal to the maximum value of
the ratio of variance between categories (between columns of S) to
total score variance, obtained by assigning scores to subjects (rows
of S) in an optimal fashion. Similarly, the largest non-trivial
root of B is the maximum value of the ratio of variance between
subjects (rows of S) to total variance, obtained by assigning
weights to categories (columns of S) in an optimal way. The
succeeding roots are the maximum squared correlation ratios for the
residualized matrices. Analysis of multiway contingency tables
through derivation of the eigenstructures of transformed response
matrices such as G, A, and B is now commonly referred to as multiple
correspondence analysis (see Tenenhaus and Young, 1985, for an
extensive review and Zwick and Cramer, in press, for an illustration
of the relation between this approach and other multivariate
techniques in the case of a two-way contingency table).

Under certain uniformity conditions, the elements and
eigenvalues of the Pearson correlation matrix for dichotomous
Guttman items can be expressed as simple functions of the number of
items. These relations may prove to have applications in the
investigation of the properties of scales. For example, in
conducting research on the dimensionality of dichotomous data, it is
of interest to determine the results of applying potential methods
of dimensionality assessment to perfect Guttman scales. A method
cannot be considered acceptable if it is known to produce the wrong
answer for dichotomous items. Equations 2, 4, 5, and 6 allow the
generation of the desired correlation matrices without generation of
the item responses themselves. Equation 7 may also be useful;
eigenvalues of possible transformations of $\Phi$ (e.g., see the section on
image analysis in Zwick, 1986) or of other proximity matrices can be
compared to "baseline" values obtained from Equation 7.

It is important, however, to recognize the effect of relaxing the
assumption that all $f_i = f$. (The assumption that all $h_k = h = 1$ is not
implausible; furthermore, if two items are perfectly correlated, one
can be discarded without loss of information.) The results of allowing
the $f_i$ to be unequal can best be demonstrated by example. Suppose once
again that n = 3, producing n + 1 = 4 response patterns. It is likely
that the intermediate response patterns, 2 and 3, will be more common
than 1 and 4, which represent all-incorrect and all-correct patterns,
respectively (see Table 3). Let us assume a simple model in which
$f_4 = f_1$ and $f_2 = f_3 = k f_1$, where k is a positive integer. The values
of the correlation coefficients and the largest eigenvalue, $\lambda_1$, of $\Phi$
are given in Table 5 for selected values of k. The correlation
coefficients can be computed by first noting (e.g., from Table 1) that
under the hypothesized model, the proportions correct for the three
items are $p_1 = (1 + 2k)/(2 + 2k)$, $p_2 = (1 + k)/(2 + 2k) = 1/2$, and
$p_3 = 1/(2 + 2k)$. Then by application of equation 3, the correlation
coefficients are found to be $\phi_{1,2} = \phi_{2,3} = \sqrt{1/(1 + 2k)}$ and
$\phi_{1,3} = 1/(1 + 2k)$. (The values of $\lambda_1$ in Table 5 were obtained numerically.) A value
of k = 1 corresponds to the case discussed above, in which all $f_i = f$.
It is clear that as k increases and the distribution of subjects becomes
more peaked, the inter-item correlations become smaller. This is not
surprising when we consider that, for a fixed scale score, Guttman items
are independent of one another. In fact, it is easily verified that, for
any pair of adjacent score patterns, any two Guttman items are
independent. By making k larger, we are increasing the proportion of
subjects whose scale scores are 1 or 2. By the time we reach k = 100,
$\Phi$ begins to resemble the identity matrix. In short, the properties of
the correlation matrix of dichotomous Guttman items can be affected
substantially by the distribution of subjects.
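
The computation described in this paragraph can be sketched as follows. The code derives the proportions correct from the pattern frequencies, applies equation 3 to obtain the correlations, and approximates the largest eigenvalue by power iteration. Power iteration is an assumption of this sketch (the paper says only that the values were obtained numerically), and the function names are illustrative.

```python
import math


def phi_matrix_from_frequencies(freqs):
    """Phi matrix of n Guttman items from the frequencies f_1, ..., f_{n+1} of the
    n + 1 response patterns (pattern 1 = all incorrect), using equation 3.

    The first and last frequencies must be positive so that no item has a
    proportion correct of exactly 0 or 1.
    """
    n = len(freqs) - 1
    total = sum(freqs)
    # Item k is answered correctly by patterns k + 1, ..., n + 1.
    p = [sum(freqs[k:]) / total for k in range(1, n + 1)]
    phi = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):           # item i is easier, so p[i] > p[j]
            r = math.sqrt(p[j] * (1 - p[i]) / (p[i] * (1 - p[j])))
            phi[i][j] = phi[j][i] = r
    return phi


def largest_eigenvalue(matrix, iterations=2000):
    """Largest eigenvalue of a symmetric matrix with positive entries (power iteration)."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
    return sum(v[r] * w[r] for r in range(n))   # Rayleigh quotient of the unit vector v


if __name__ == "__main__":
    for k in (1, 2, 10, 100):
        phi = phi_matrix_from_frequencies([1, k, k, 1])   # f_1 = f_4 = 1, f_2 = f_3 = k
        print(k, phi[0][1], phi[0][2], largest_eigenvalue(phi))
```

Setting $f_1 = 1$ loses no generality here, because only the relative frequencies enter the proportions correct.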

Table 5
Values of Correlation Coefficients ($\phi_{ij}$) and
Largest Eigenvalue ($\lambda_1$) for Selected Values of k

k        $\phi_{1,2} = \phi_{2,3}$      $\phi_{1,3}$      $\lambda_1$

1        .58                            .33               2.0
2        .45                            .20               1.7
10       .22                            .05               1.3
100      .07                            .01               1.1

Note. It is assumed that n = 3, $f_4 = f_1$, and $f_2 = f_3 = k f_1$.

Of particular interest is the case in which scores follow a normal
distribution. For n = 8 items, Table 6 shows the proportions correct
for each item and the eigenvalues of $\Phi$ under two conditions: (a) a
uniform distribution of number-right scores, as above, and (b) an
approximately normal distribution of number-right scores, created by
setting $f_1 = f_9 = 4$, $f_2 = f_8 = 7$, $f_3 = f_7 = 12$, $f_4 = f_6 = 17$, and $f_5 = 20$.
Under the normal conditions, the proportions correct for the items are no
longer equally spaced, as they are under the uniform conditions. The
largest eigenvalue is smaller than in the uniform case; the remaining
roots are larger. The results for the normal case, as well as the results
for k = 2 in Table 5, show that substituting a more realistic score
distribution for the uniform distribution does not result in an increase
in the size of the first eigenvalue. In fact, the size of the first root
is largest for a U-shaped distribution; that is, when there are more
subjects in the extreme score patterns and fewer in the intermediate
patterns.

If the researcher wishes to know the size of the correlations and the
roots of the correlation matrix for n Guttman items with a symmetric
score distribution, the values obtained from Equations 2, 4, 5, 6, and 7
may be adequate estimates. An alternative is to assume a specific score
distribution and apply procedures analogous to those used to obtain the
values in Table 5.
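
A minimal sketch of the alternative procedure, for an arbitrary assumed score distribution, is given below. It uses the approximately normal frequencies stated in the text for n = 8 and, purely for comparison, a U-shaped set of frequencies that is hypothetical (the same values in reverse order of magnitude, not taken from the paper). The largest eigenvalue is approximated by power iteration, which is an assumption of this sketch rather than the paper's stated method.

```python
import math


def proportions_correct(freqs):
    """Proportion correct for items 1..n given pattern frequencies f_1, ..., f_{n+1}."""
    total = sum(freqs)
    # Item k is answered correctly by patterns k + 1, ..., n + 1.
    return [sum(freqs[k:]) / total for k in range(1, len(freqs))]


def phi_matrix_from_frequencies(freqs):
    """Phi matrix from equation 3; first and last frequencies must be positive."""
    p = proportions_correct(freqs)
    n = len(p)
    phi = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):           # item i is easier, so p[i] > p[j]
            r = math.sqrt(p[j] * (1 - p[i]) / (p[i] * (1 - p[j])))
            phi[i][j] = phi[j][i] = r
    return phi


def largest_eigenvalue(matrix, iterations=2000):
    """Largest eigenvalue of a symmetric matrix with positive entries (power iteration)."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
    return sum(v[r] * w[r] for r in range(n))


if __name__ == "__main__":
    distributions = {
        "uniform": [1] * 9,
        "approximately normal": [4, 7, 12, 17, 20, 17, 12, 7, 4],     # from the text
        "U-shaped (hypothetical)": [20, 17, 12, 7, 4, 7, 12, 17, 20],
    }
    for name, freqs in distributions.items():
        phi = phi_matrix_from_frequencies(freqs)
        print(name, [round(x, 2) for x in proportions_correct(freqs)],
              round(largest_eigenvalue(phi), 3))
```

Under the uniform distribution the first root should equal (n + 1)/2 = 4.5, in agreement with Equation 7; the bell-shaped frequencies give a smaller first root and the U-shaped frequencies a larger one.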

Footnote

¹In an appendix to an article on factor analysis of Guttman-scalable
items, Burt (1953, p. 21) gives the same expression for the
"factor-variances" obtained by analyzing a matrix of "product moment
coefficient[s] applied to the data after they have been transformed
to standard measure" (1953, p. 11). The scores he describes are
obtained from the items by subjects matrix of 0-1 item responses by
multiplying by n and then centering each row. Burt makes a clear
distinction between a correlation based on standardized scores and a
"product-moment correlation for a twofold point distribution ($\phi$)"
(Burt, 1950, p. 169; 1953, p. 20). In fact, if correct responses
to an item, k, are assigned a score of $a_k$ and incorrect responses a
score of $b_k$, then $\phi_{ij}$ is invariant across all possible values of $a_k$
and $b_k$, k = i, j. The correlations described by Burt are therefore
identical to phi coefficients. Because his factor-variances are
obtained through principal component analysis of the correlation
matrix, they are identical to the eigenvalues of $\Phi$.